pp108 : Handling large XML documents in Process Platform - Streaming XML content

This topic describes how large XML documents are handled and the streaming XML support available to business process models.


Certain business scenarios involve processing large volumes of data as part of executing a business process instance. In such cases, instead of processing all data in a single, performance-intensive step, it is better to handle data, represented as an XML file, as a steady and continuous stream. This translates into a need to provide control over the manner in which data is processed by the Process Engine.

This approach has the following benefits:

  • Less memory consumption
  • Increased performance

Salary Processing of Railway Employees


Let us assume a scenario in which a government organization, such as the railways, processes employee salaries as a batch process. The employee information is held in a legacy system. For salary processing, the organization queries the legacy system and creates a batch file. These batch files are very large (gigabytes in size). In the business process model, for every employee, you have to get the employee's work details, process the salary, and generate the pay slip.

Without streaming support, you cannot stream the XML data from a file into the Business Process Management Service container. You have to load the complete file into the memory of the Business Process Management Service container, which eventually causes an out-of-memory problem. To avoid this, as a developer, you have to convert the employee records in the XML batch file into records in a table, generate the query methods, and use these methods in the business process model. Developing, packaging, and deploying batch processing models in this way is tedious and time-consuming.

Streaming Support


  1. Streaming support in XPath: XPath now provides a streaming API. Using this API, you can navigate through a collection of records without loading the entire XML into memory.
  2. Streaming support in the For Each construct

  • In the For Each BPMN construct, you provide an XPath expression that returns a collection of XML nodes (records) over which the construct iterates. The For Each construct is now extended to support streaming XPath expressions.
  • The For Each construct works in two modes: Normal mode and Streaming XPath mode.
  • When Streaming XPath mode is selected, you must provide the file URL to the For Each construct through a message in the message map.
  • At runtime, the engine resolves the URL from the message, creates an XPathReader for that URL, and streams the records from the file one at a time, one record per iteration.
  • By default, the streamed record is deleted after each iteration. If you want to retain the record, an option is provided to do so.
  • In case of crash recovery, streaming resumes from the position reached when the crash occurred.
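
The streaming behaviour described above can be illustrated outside the product with a small, generic sketch. The example below does not use the Process Platform XPathReader or the For Each construct. It is an analogy written in Python with the standard library xml.etree.ElementTree.iterparse function, and the names stream_records, process_batch, SAMPLE, and the employee/id/basic elements are hypothetical, chosen only for illustration. It reads one record at a time, discards each record after its iteration unless asked to keep it, and accepts a starting index so that processing can resume after an interruption.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical batch content; real batch files would be far larger.
SAMPLE = b"""<employees>
  <employee><id>1</id><basic>1000</basic></employee>
  <employee><id>2</id><basic>2000</basic></employee>
  <employee><id>3</id><basic>3000</basic></employee>
</employees>"""


def stream_records(source, record_tag, start_index=0, keep_records=False):
    """Yield (index, element) for each record_tag element, one at a time.

    start_index skips records that were already processed (for example,
    before a crash). With keep_records=False, a record is discarded once
    the caller asks for the next one.
    """
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # the first event is the start of the root element
    index = 0
    for event, elem in context:
        if event != "end" or elem.tag != record_tag:
            continue
        if index >= start_index:
            yield index, elem
        index += 1
        if not keep_records:
            elem.clear()  # drop the record's own content
            root.clear()  # drop the root's references to finished records


def process_batch(source, checkpoint=0):
    """Process every record from the checkpoint onward.

    Returns the processed (id, basic) pairs and the new checkpoint. A real
    system would persist the checkpoint after each iteration; here it is
    only returned to the caller.
    """
    processed = []
    for index, emp in stream_records(source, "employee", start_index=checkpoint):
        processed.append((emp.findtext("id"), int(emp.findtext("basic"))))
        checkpoint = index + 1
    return processed, checkpoint
```

Calling process_batch(io.BytesIO(SAMPLE)) walks through all three sample records, while process_batch(io.BytesIO(SAMPLE), checkpoint=2) skips the first two and processes only the last one, which mirrors resuming from the position reached before a crash. Passing keep_records=True to stream_records corresponds to the option of retaining streamed records, at the cost of higher memory use.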